Designing prosodic databases for automatic modelling in 6 languages
Abstract
We describe the design and creation of prosodic speech databases for 6 languages. The purpose of the databases is to allow derivation of prosody models in order to improve TTS synthesis. The main prosodic variables to model were word prominence, prosodic boundary strength and phone duration. We describe the database structure and contents and the methodology for creating prosodic databases, and we present statistics for the main prosodic variables.
Similar resources
Designing Prosodic Databases for Automatic Modeling of Slovenian Language in a Multilingual TTS System
Linguistic Annotation of Two Prosodic Databases
Two prosodic databases were annotated with linguistic information using SGML (Standard Generalized Markup Language), one database of American English and one of Modern Standard German. Only information that might have prosodic correlates was annotated. Phonetic and morphological information was supplied by automatic tools and then hand corrected. Semantic and pragmatic information was inserted by ...
Automatic prosodic labeling of 6 languages
This contribution describes a method for the automatic prosodic labeling of multi-lingual speech data. The prosodic labels are word boundary strength and word prominence. The speech signal and its orthographic representation are first transformed to feature vectors comprising acoustic and linguistic features such as pitch, duration, energy, part-of-speech, punctuation, word frequency and stress...
The Recognition of Emotion
Detecting emotional user behavior, particularly anger, can be very useful for successful automatic dialog processing. We present databases and prosodic classifiers implemented for the recognition of emotion in Verbmobil. Using a prosodic feature vector alone is, however, not sufficient for the modelling of emotional user behavior. Therefore, a module is described that combines several knowledge...
Data driven intonation modelling of 6 languages
A method for creating multi-lingual intonation models is described. The method adheres closely to the pioneering work of Traber, in that a recurrent neural network (RNN) predicts a number of F0 values per syllable. An important aspect of the work presented here is the selection of linguistic and prosodic features that are suitable for predicting the observed intonation phenomena in different la...